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Abstract 



C/3 

O 

jl^.,^, Self Consistent Normal Mode Analysis (SCNMA) is applied to heme c type cytochrome 

i-Q , f to study temperature dependent protein motion. Classical Normal Mode Analysis (NMA) 

O-t. 

assumes harmonic behavior and the protein Mean Square Displacement (MSD) has a lin- 

*vj ', ear dependence on temperature. This is only consistent with low temperature experimental 
K* I results. To connect the protein vibrational motions between low temperature and physiolog- 
ist I ical temperature, we have incorporated a fitted set of anharmonic potentials into SCNMA. 
^r ' In addition, Quantum Harmonic Oscillator (QHO) theory has been used to calculate the dis- 
• ' placement distribution for individual vibrational modes. We find that the modes involving 
(^ ' soft bonds exhibit significant non-Gaussian dynamics at physiological temperature, which 
^"^ ' suggests it may be the cause of the non-Gaussian behavior of the protein motions probed 
by Elastic Incoherent Neutron Scattering (EINS). The combined theory displays a dynami- 
, J cal transition caused by the softening of few "torsional" modes in the low frequency regime 
5_j ■ (< SOcm^^or < 6meVor > O.Gps). These modes change from Gaussian to a classical distri- 



bution upon heating. Our theory provides an alternative way to understand the microscopic 
origin of the protein dynamical transition. 



*ewp@purdue.edu 



1 Introduction 

Protein function is determined by both structural stability and flexibility. The stability is needed 
to ensure appropriate geometry of the protein, while the flexibility allows function to proceed at 
an appropriate rate. Quantitative measurements of the temperature dependent atomic mean square 
displacements (MSD) are possible by neutron scattering m |[2ll[TT]| and Mossbauer absorption 18] |9j 
[TOl . All of these experiments show a "dynamical transition" in hydrated proteins, which is marked 
by an abrupt MSD increase in the temperature range 160-240K. It is believed that this dynamical 
transition is correlated with protein function. Three prominent examples are the myoglobin-CO 
binding kinetics |[T2]| . electrostatic relaxation in green fluorescent protein 1131 . and the Arrhenius 
behavior of the electron transfer rate above the dynamical transition temperature. However, the 
time scale and the forms of the functionally important atomic modes remain a subject of active 
discussionHmillTl. 

Numerous theoretical studies of protein dynamics have been carried out by molecular dynam- 
ics (MD) simulations HH [B] [TS [13 and normal mode analysis (NMA) |[T6l [TSl fWl IM UTI. 
NMA requires the use of Maxwell-Boltzmann or Gaussian distributions to describe the proba- 
bility distributions of individual atoms or chemical bonds. Recently, several authors focused on 
the study of the non-Gaussian behavior of the total elastic incoherent neutron scattering (EINS) 
profile from a protein above dynamical transition temperature ||4l|22l|23. It should be noted that 
the distribution of all-atom MSDs from an EINS profile can still be non-Gaussian even if all 
atoms individually exhibit Gaussian dynamics. The Gaussian distribution, which is the ground 
state probability distribution for the quantum harmonic oscillator, is an appropriate approximation 
when hco > kT(a> > lOOcm'^). In the Gaussian distribution, the atom has maximum probability in 
the equilibrium position. We find that in all self consistent theories, the use of a Gaussian distribu- 
tion results in a molecular structure that will tend to be more rigid than what would be found by a 
more exact quantum approach. From Newton's second law, the classical harmonic oscillator (low 
frequency) has highest probabihty at the edges of the well because the atom moves most slowly 
near the classical turning points, which is contrary to the Gaussian or ground state probability 
distribution. The exact quantum behavior of low frequency modes would approach the classical 
displacement. In this paper we explore the role of incorporating the higher quantum vibrational 
states. This shows a softening of the structure in the correct temperature range. 

The material studied by SCNMA is six-coordinate heme c type cytochrome f li26il . The iron 
normal modes are compared with the Nuclear Vibrational Resonance Spectroscopy (NRVS) spec- 
trum Il29l . NRVS is uniquely capable of displaying the low frequency vibrational displacement 



spectrum of the Fe atom at the center of the heme as it sees all modes and can give quantitative 
values for displacements. It is then possible to define low frequency heme modes that are in agree- 
ment with observation with greater accuracy. This is a much more stringent test than most Raman 
comparisons as Raman displacements cannot be calculated with any accuracy. 

SCNMA incorporates non-linearity into harmonic calculation by thermal-statistically averag- 
ing the curvature of the bond potential energies. Because vibrational modes that are not over- 
damped are detected by Raman and IR, one expects the effective Hamitonian to be approximately 
harmonic. SCNMA should therefore be a valid approach. The SCNMA formulation arises from 
a variational procedure that finds the best effective harmonic Hamiltonian by minimizing the free 
energy. This method is described in detail elsewhere OTl . It has been successfully employed on 
models with multiple hydrogen stretching bonds such as the helix melting, conformational change 
in DNA and drug-helix stability, etc |[32l - |[35l . In those papers a Gaussian distribution was used 
to describe the displacement distribution. In this paper, we will further develop this method to 
incorporate non-Gaussian distributions into our calculation. 

2 Quantum Harmonic Oscillator (QHO) theory applied to internal 
atomic bonds 

2.1 The displacement distribution of the internal atomic bonds 

For a bio-molecule with N atoms and M internal atomic bonds M is much larger than N. Stan- 
dard NMA will give us 3N-6 non-zero normal modes. Their frequencies can be written as to = 
[a)i,a>2, ..., cojN-^]. The total MSD for frequency co can be written as 

^^ m,r, >= ^cotK^ (1) 

Subsequently, the temperature dependent total mean square amplitude for the one single inter- 
nal bond is the sum of all normal mode amplitudes, which can be written as 

where D^ is the total mean square amplitude over all frequency modes, D^ is the mean square 
amplitude contribution and d^j is the zero point mean square amplitude for frequency o), and I^^^^P is 
the projection of the normalized eigenvectors at eigenvalues (frequency) a» onto the mass-weighted 



internal coordinates. These amplitudes can represent a linear distance (for stretching bond) or an 
angular twisting (for angle bend and dihedral bond). 

From QHO theory, the harmonic displacement distribution for one particular internal bond at 
frequency a> can be written as 



where u^^ is the displacement variable for mode co, //„ is the Hermite polynomial, and ^n is the 
probability distribution for the «''' excitation state. Here we note that the ground state < M(^|//|4'o > 
is in fact a Gaussian. The corresponding quantized energy levels are 

En = noj(n + -) (4) 

From the Boltzmann distribution, the displacement distribution of this internal coordinate for mode 
CO can be written as 

fojiUcj) ^ — — T (5) 

V CO — 

The joint probability density function for co = [(1)1,0)2, -.-, CO3N-6] can be subsequently written as 

g{Uoji,Uoj2, ...) = Y\ foji^oj) (6) 

The total displacement is m = 2^^ m^. Using a transformation of variables 

\U — y ^ M^, M^2 ~ ^1^2, M^3 = M^j, ...J (/) 

the total displacement distribution can be obtained as 

Xco poo ., 

• • • I g{u- ^Uojj, Uoj2,Uoj3, ...)du^^duoj^ ■ ■ ■ M^3„_5 (8) 

CO *7 — CO -^ 

To reduce unsystematic errors, m^, is chosen to have the largest amplitude of the 3A'^ - 6 
modes. Equation 8 requires 3A'^ - 7 integrals of a 3A'^ - 6 multi- variable function to calculate the 
actual displacement distribution of one single internal bond. For one standard NMA calculation, 
(3A'^ - 7) X M integrals are solved. To reduce the required calculation time, approximation methods 
are employed, as introduced in the next section. 



2.2 The displacement distribution for single frequency harmonic motion and an 
approximation method 

To understand the approximate temperature and frequency behavior of non-Gaussian distributions, 
that of a single frequency normal mode displacement is shown in Fig 1. It shows the temperature 
dependent single frequency displacement distribution at 300K for (a) cli < 50cm'^ (> 0.61 ps), (b) 
50cm"i < 6j < 80cm"i (0.42ps-0.67ps), and (c) oj > 80cm"i. 
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Figure 1 : Characterization of the single frequency displacement distribution at 300K In (a) the 
displacement is similar to the classical distribution, (c) the displacement distribution is similar to 
a Gaussian, and (b) is a cross between the two 



Fig 1 shows the displacement probability for a single frequency, but the displacement for a 
single bond is a superposition of many such frequency contributions with different amplitudes. 
The spread in amplitudes comes from the projection factors (li'^.P) from equation (5) which come 
from the eigenvectors of the various modes. Even for low frequencies, any bond amplitude would 
be the sum of many distributions like those in Fig 1, all at different amplitudes from the origin. 
The central Limit Theorem (CLT) supposes that a large sum of this kind will add up to a Gaussian 



distribution. This assumption has been central to all previous calculations using SCNMA. The 
situation could be quite different, however, if only a few low frequency modes dominate in the 
displacement of particular bonds. In such a case, for some range of temperatures, the displacement 
probability could resemble the plot in Fig l-(a). We emphasize that the hydrogen bond stretching 
modes are typically above lOOcm"^ and fall into the Gaussian distribution regime. The bond 
modes that are softer than the hydrogen stretching bonds, i.e. the torsional motions, may exhibit 
non-Gaussian behavior at physiological temperature. All proteins have torsional modes and this 
effect may be manifested in many proteins. 

From equation (5), the displacement distribution of the single frequency mode is approxi- 
mately Gaussian when 

aj{T) > Q.llTcm-\TinKelvin) (9) 



and more classical when 



a){T) < 0.llTcm~\TinKelvin) (10) 



From equations (9) and (10), the single frequency mode in the frequency regime < SOcni"' (or 
< 6meV or > 0.1 ps) will transition from a Gaussian to a more classical distribution upon heating 
from low temperature to room temperature. It should be noted that the prominent "Boson peak" 
(1 -3.5meVorlO-30cm~') from neutron scattering |l3][lT]|36l or the "doming mode" fromNRVS 
Il29l and IR Il44l experiments lie in this frequency regime. 

To simplify the calculation, we use the assumption that the sum of the independent Gaussian 
variables is still a Gaussian and we treat all the normal modes above 80cm~' as one Gaussian 
distribution. Based on CLT, we can further simply the low frequency displacement distribution 
calculation. If the displacement u for one internal bond is comprised of many low frequency 
modes, we can treat it as a Gaussian. To test how many significant low frequency (< 80cm~^) 
modes are needed to be able to use the Gaussian approximation without loss of accuracy, several 
NMA and subsequent displacement distribution calculations were run on the heme core. We 
found less than 5% deviation from Gaussian in the distribution of u (equation 8) when u has more 
than 5 low frequency modes each accounting for more than 10% of the total potential energy. 
Implementing these two approximations reduce our calculation time by a factor of more than 100. 

3 Method 

An initial classical NMA calculation was performed on the six-coordinate heme c type cytochrome 
fusing the CHARMM force field ||27ll2Sl- An all-atom model |i2Sl was constructed from the X- 



ray coordinates (PDB identifier lEWH II26II ). The model was subjected to force field minimization 
until the root mean square gradient of the potential energy was less than 0.0001 prior to performing 
a standard normal mode calculation with the VIBRAN facihty in CHARMM[27|. 

The low temperature CHARMM force field was refined by comparison with the Nuclear Vi- 
brational Resonance Spectroscopy (NRVS) spectrum 1291 . The method of force field refinement 
process was described elsewhere |[37l [38l [391 . The anharmonic functional forms were chosen from 
reference BOl . in which Morse function, harmonic cos function and dihedral cos function were 
used to describe bond stretch interactions, angle bend interactions and torsional bond interactions. 
The resulting low temperature force constants can then be used along with data on atom distances 
and bond strength to fit anharmonic potential parameters. SCNMA was employed to allow ex- 
ploration of temperature dependent changes in force constant and thermal expansion effects 1311 . 
This method has been described in detail elsewhere 1311 [32l l33l l34l |35^ . where the Gaussian ap- 
proximation was used for the displacement variable u. The only difference here from the previous 
SCNMA is the explicit inclusion of non-Gaussian distributions for low-frequency modes. Here, 
we give a brief description of the computation: 

• Input the effective force constants (the 1 " iteration uses the force constants refined to exper- 
imental data) into the NMA and find the initial normal mode eigenvalues and eigenvectors. 

• Calculate each internal coordinate's total mean square amplitude D^ and each normal mode 
contribution D^ 

• Calculate each internal coordinate displacement distribution f(u) 

• Calculate a new set of the effective force constants. 

• Iterate to self consistency. 

The calculation converged within 20 iterations. 

4 Result and discussion 

Fig 2 shows the comparison between the iron Vibrational Density Of State (VDOS) obtained from 
classical NMA and the NRVS experimental results. Good agreement is achieved over a wide range 
of frequencies, which indicates a useful choice for the low temperature limit force field. Here, we 
give a summary of the general results: (1) below 80cm~' are mostly iron-out-of-plane motions; 



(2) the 80 - 300cm"^ region has both iron in-plane and out-of-plane features; and (3) > 300cm~^ 
are mainly iron in-plane motions. If the calculations did not include anharmonic effects, the total 
displacements would be linear in temperature. Classical NMA results show that at low temperature 
(< 150^), the iron out-of-plane MSD is about three times the iron in-plane MSD despite the fact 
that the iron in-plane motion has two degrees of freedom versus the single degree of out-of-plane 
motion. 
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Figure 2: Comparison of the experiment (from reference pQl) and theoretical (from classical 
NMA) cytochrome f iron Vibrational Density Of State (VDOS) Solid black line: experiment; 
dotted red line: theory 



Fig 3 and Fig 4 show the cyt f iron total MSD from SCNMA. These results are in general 
agreement with the Mossbauer absorption experimental results conducted on other heme proteins. 
The high frequency (> 200cm~^) normal modes are softened on an average of 1 - 2%. This is 
because the high frequency normal modes are dominated by covalent stretching bonds which have 
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Figure 3: The iron MSD vs. temperature plot for various heme proteins. Blue line: iron MSD 
of cyt f by classical NMA; Blue cross and blue dashed line: iron MSD of cyt f from SCNMA 
with Gaussian distribution approximation. Red star: iron msd of cyt f from our SCNMA by 
implementing non-Gaussian displacement distribution. Blue diamond: iron MSD of myoglobin 
by Mossbauer absorption measurement from ref. ISJ. Black upper triangle: iron MSD of cyt c by 
Mossbauer absorption measurement from ref. ||9l 



relatively larger strength and deeper potential wells. Moreover, these high frequency atomic mo- 
tions follow a strict, narrow Gaussian distribution. Fig 4 shows that the iron dynamical transition 
is caused by iron low frequency out-of-plane motions. At lower frequency, the large iron out- 
of-plane motion becomes possible because of the small energy involved in changing the torsion 
angles. As temperature increases, more and more displacement will spread out from the Gaussian 
centroid. This low frequency classical behavior of the atomic displacement distributions coincides 
with the fact that the curvature of the potential function decreases over the distance from the cen- 
troid, which results in the abrupt MSD increase seen in our SCNMA model as compared with 
calculations implementing only Gaussian distributions (Fig 4). 

To analyze the protein flexibility, the force strength defined by Zaccai BH is generally used by 
other authors HJl 



^n = 



clT 



(11) 



From this definition,the iron force strength decreases by a factor of ~5-7. From NMA, the force 
constant is Uq - ma/ and we extract a> to find r^. We found that the dihedral bonds, which are the 
major dynamical element contributing to the iron dynamical transition in our model, are softened 
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Figure 4: The cyt f iron in-plane and out-of-plane MSD from SCNMA. Dashed line: iron in- 
plane MSD from classical NMA; Circle: iron in-plane motion from SCNMA; Solid line: iron 
out-of-plane motion from classical NMA; Star: iron out-of-plane motion from SCNMA 

by only ~ 20%, as show in Table 1. The difference between the two definitions can be explained 
with equation 1. A simple plot of equation 1 (Fig 5) shows that r^ increases exponentially below 
50cm~^. Our results show significant lowering of frequencies in this frequency region. Moreover, 
we found that atoms with internal coordinates associated with soft bonds exhibit a larger MSD 
increase than other atoms in one particular normal mode. 

The MSD spread over frequency increases disproportionally upon heating, as shown in Fig 
6. At temperatures below ~150 K, the iron MSD for the normal mode frequencies that are below 
50cm~l takes about 84% of the total iron MSD, while at 300K it increases to 92%. 

Generally speaking, the normal modes that participate in biochemical reactions should have 
the largest motional amplitudes. The largest amplitude among the iron out-of-plane normal modes 
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Bond Type 


<150K 

(mdyne*A/rad) 


300K 
(mdyne*A/rad) 


Percentage Softened 


Fe-N-C-C 


0.127 


0.096 


24.4% 


N-Fe-N-C 


0.135 


0.111 


17.9% 


Fe-N-C (-C) 


0.062 


0.050 


19.4% 


N-Fe-N (-N) 


0.098 


0.072 


16.3% 



Table 1 : The softening of the iron dihedral force constant from SCNMA 



— normally characterized as the "doming mode" — has been intensively studied experimentally 
H31 - B61 and theoretically 1147 II - 1501 . This mode is Raman inactive In a four-fold symmetric por- 
phyrin. The IR spectroscopy and NRVS of cytochrome f failed to identify a well-resolved mode 
with such character, and with the intensity expected for a heme doming mode in the low frequency 
region. The modes around 40cm~^ and 80cm~^ have been assigned to have the doming features 
by various authors ll46lll49lll5T]| . In our SCNMA calculation, the normal modes 80cm~^ have 
the features of both iron doming motions and in-plane motions. The iron MSD in the frequency 
regime 70 - 90cm~^ takes less than 10% of the total MSD, and these modes are close to Gaussian 
distributions at room temperature from QHO theory. A theoretical study of the doming mode has 
been carried out earlier by Li and Zgierski HTl on a five-coordinated heme model. In the study, the 
doming mode was predicted to be around 50cm~^ and was calculated to be 35cm^^ . Their analysis 
found that the doming mode takes about 90% of the iron MSD at room temperature. In one pre- 
vious NMA calculation, one 37 cm^^ doming mode was found in four-coordinate heme compound 
Fe(OEP), which takes 67% of the total iron MSD (unpublished results). In our six-coordinate cyt 
f SCNMA, three normal modes that have the most iron MSD are (19cm~^ 35cm~' and 49cm~^) at 
low temperature and softened to {I4cm~^,23cm~^ and 37cm"^). These three modes take 63% of 
the iron MSD and increase to 81% at room temperature. We assign them to the doming modes due 
to their significant doming features. QHO theory indicates that these modes are Gaussian distribu- 
tions at low temperature (< lOOK) and more classical at room temperature (300K). As temperature 
increases, these modes develop other features like saddling and ruffling due to the softening of the 
dihedral bonds that are associated with these modes. The energy distribution shows that these 
doming modes are highly delocalized, i.e., the potential energy is distributed among a large num- 



11 



Figure 5: A plot of < Yj'i=i >^ifj > as a function of temperature (T) and (low) frequency co from 
equation (4) 



ber of internal coordinates and the kinetic energy is distributed among a large number of atoms. 
We also observe that the iron low frequency motions are in phase with some other soft bond atoms. 

Besides the doming mode, some other significant water-protein motions are observed in the 
frequency regime below 50cm' ^. These modes are softened by 20 - 50% from low temperature 
to room temperature. These results can also qualitatively explain the two onsets of anharmonicity 
suggested by several authors 151 ll52l as they proposed there are two motional components: one 
happens at T lOOK and one at T 200-230K. As shown in Fig 6, lower frequency modes have a 
relatively lower dynamical transition temperature. 

The statistical properties of fast hydrated protein motions have been analyzed by neutron scat- 
tering PI and X-ray diffraction experiments |i6|. At temperatures below 200K, the displacement 
distribution is statistically a Gaussian. However,a deviation from a Gaussian distribution becomes 
significant at temperatures above 240K. In our SCNMA calculation, below lOOK, the motions 
of individual atoms exhibit Gaussian behavior, but starting from 100 K, the atoms participating 
in soft internal coordinates transition from Gaussian to classical distribution upon heating. The 
percentage of heavy atoms that exhibiting classical behavior rises to 20% at 300K. This result 
agrees with the proposal by other authors who suggest the protein dynamical transition is caused 
by water induced torsional jump BIl ifTTI . Furthermore, we also quantitatively identify that the nor- 
mal modes that contribute to the dynamical transition lie in the frequency regime of < 50cm~^ at 
temperatures below the dynamical transition temperature. 
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Figure 6: Cyt f iron MSD from SCNMA in three frequency regimes: (a) red star < 
20cm~^(2.5meV),(b) black diamond: > 20cm~'(2.5meV)and < 50cm~^{6meV) and (c) blue circle 
> 50cm~^{6meV) 

5 Conclusion 

SCNMA can be used to study temperature dependent protein vibrational motions. In the past, 
all such calculations assumed Gaussian displacement distributions. However, single oscillators 
depart from Gaussian distribution at higher temperatures. This departure from Gaussian behav- 
ior was studied quantitatively here using QHO theory and SCNMA. Our study of heme c type 
cytochrome f has led us to identify some specific features of the atomic interactions which may 
be of general validity. Our results show that only a few normal modes account for most of the 
motional amplitudes of a significant set of bonds. These modes lie in the frequency regime 
< 50cm~^{or < 6meVor > 0.6ps). The higher frequency normal modes essentially maintain a 
narrow Gaussian distribution. Above lOOK, the low frequency modes transition from Gaussian to 
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more classical distributions upon heating, facilitating the softening of dihedral (torsional) bonds, 
which seems to lead to the dynamical transition. 
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